Behavioral Differences Between Normal 


and Traumatized Newborns: 


II. Standardization, Reliability, and 
Validity 


By 


Frances K. Graham 
Washington University 


Ruth G. Matarazzo 


Harvard University 


and 


Bettye M. Caldwell 


Washington University 


Vol. 70, No. 21 
Whole No. 428, 1956 
Psychological Monographs: General and Applied 


Behavioral Differences Between Normal and 
Traumatized Newborns: 


II. Standardization, Reliability, and Validity¹ 


Frances K. GRAHAM 
School of Medicine, Washington University 


Ruth G. MATARAZZO 
School of Medicine, Harvard University 


AND Bettye M. CALDWELL 
School of Medicine, Washington University 


IN AN effort to measure behavior which 
would differentiate a group of in- 
fants who were “normal” from those who 
might be diagnosable as cases of “brain 
injury,” the five test procedures de- 
scribed in the preceding paper (7) were 
developed. The effectiveness of these pro- 
cedures is reported in the present paper. 


SUBJECTS 


Our subjects (Ss) were full term infants born on the inpatient service of
the St. Louis Maternity Hospital during the period from July, 1953 through
October, 1955.² On the basis of a sample from a private, teaching hospital,
it is not possible to generalize about the frequency with which certain
kinds of abnormality will occur in the population as a whole. It is probable
that both socioeconomic level and techniques of prenatal and postnatal care
reduced the amount of abnormality in our group. This is not a serious
limitation in interpreting our results, however, since we are not concerned
with absolute frequencies but with relative differences between normal and
traumatized Ss. Whatever differences we could demonstrate between normal
and traumatized groups would undoubtedly be more striking in the population
as a whole where abnormality would be more marked.

The traumatized group was composed of almost the total population of
traumatized infants born during the course of the study, with the exclusion
of infants who were overlooked, and of a few infants whom we could not
obtain permission to examine. We were informed

¹ This investigation is part of a long-term project being carried out in
collaboration with Drs. Alexis F. Hartmann and Miriam M. Pennoyer. The work
was initiated by a four-month grant from the Frances Israel Fund of the
Noshin Rachmonioth Society of St. Louis, Mo. From Nov. 1, 1953 to Oct. 31,
1954 it was supported by the Children’s Research Foundation, St. Louis, Mo.,
and since Nov. 1, 1954 by a research grant B685 from the National Institute
of Neurological Diseases and Blindness, of the National Institutes of
Health, Public Health Service.

² The writers are indebted to the authorities of the St. Louis Maternity
Hospital and to the obstetricians and pediatricians who provided the
opportunity for carrying out the study. We should like to express special
gratitude for the cooperation of the nursing staff and especially of Miss
Margaret Weber.




TABLE 1 


CLASSIFICATION OF TRAUMATIZED SUBJECTS 
ACCORDING TO KIND AND DEGREE 
OF TRAUMA 


Degree of Trauma 
Mild Moderate Severe 


Kind of Trauma 


Anoxia 21 26 
Mechanical trauma 0 0 
Infections or diseases* 3 12 

Total N 24 38 


* Erythroblastosis fetalis, hypoglycemia, men- 
ingitis. 


by the pediatric staff of all infants who, 
either at birth or subsequently, might be 
classified as abnormal. Infants were 
tested, if possible, within 24 hours after 
birth or as soon after that as their condi- 
tion permitted. Only infants seven days 
old or younger are included in the study. 

The kinds and degree of trauma pres- 
ent in the abnormal group are shown in 
Table 1. A pediatrician experienced in 
the neonatal field made the classifications 
without knowledge of psychological test 
results. A second pediatrician also rated 
25 of the same cases.³ Pediatrician One 
consistently rated cases more severely 
than Pediatrician Two, but the extent of 
agreement was high and significant as 
estimated by a correlation ratio of .86. 
It is important to emphasize that there is 
no reason to expect that all traumatized 
Ss would sustain actual brain damage and 
that those classified as “mild” are considered 
by both pediatric judges to have 
good prognoses. 

³ We are grateful to Drs. Miriam Pennoyer 
and Marshall Greenman, of the Newborn and 
Premature Service of the St. Louis Maternity 
Hospital, for assistance in this phase of the 
study. Originally, they were asked to group the 
cases into as many discriminable ranks as possible, 
but their judgments were uncertain when 
more than three ranks were attempted. These 
three ranks were roughly identified as mild, 
moderate, and severe trauma or good, questionable, 
and guarded prognosis. The latter description 
is probably preferable and has been 
adopted in subsequent work. 

The normal group was composed of 
infants without prenatal, perinatal, or 
postnatal complications. Cases were not 
included where there was maternal bleed- 
ing during pregnancy or serious ma- 
ternal illnesses such as rubella, diabetes, 
hypertension, etc. Perinatal circum- 
stances were considered satisfactory when 
delivery was spontaneous or by low for- 
ceps, respiration and cry were established 
in a few seconds, and the infant was ac- 
tive. Infants were not tested on the day 
of circumcision or with elevated tem- 
peratures. Sufficient numbers were tested 
on each of the first five days of life so 
that the effect of age could be either 
statistically weighted or controlled by 
pairing normal with traumatized Ss. 

The primary requisite in selecting a 
normal control group is that it shall not 
differ from the experimental group except 
in regard to the independent variable. 
Unfortunately, when the independent 
variable is not under the control 
of the experimenter, there is danger 
that characteristics inherent in the S may 
be associated with both the independent 
and dependent variables and thus confound 
the results. There is no certain 
way of avoiding this, and the best that 
can be done is to control for such char- 
acteristics as experience indicates may be 
significant. In this instance, there is 
reason to think that prematurity, sex, 
race, and socioeconomic status are associ- 
ated with the occurrence of trauma. 
However, unless they are also related to 
the dependent variable, i.e. test perfor- 
mance, the results will not be confounded 
and there will be no advantage to match- 
ing or pairing Ss or otherwise controlling 
for possible effects. 

Preliminary results indicated that, 


with the exception of prematurity, these 
characteristics were not related to most 
of our tests. We therefore decided to test 
full term normal infants, without any 
predetermined selection criteria other 
than those determining classification as 
normal, and then to test statistically for 
relationships between S characteristics 
and test performance and to control such 
relationships as were found to exist. 
The exact procedure used in selecting 
Ss was determined by the supply avail- 
able in the nursery on any given day. 
Because of the scarcity of traumatized 
Ss, a priority system was set up. Any 
traumatized Ss available were examined 
first. No selection was involved here, as 
all such Ss with the exceptions noted 
above were examined. Next, all infants 
on whom blood oxygen saturation tests 
had been made were examined. At in- 
tervals during the period of study, blood 
samples were obtained from all infants 
born during the hours the research 
pediatrician was on duty. This was an 
unselected group of mixed normals and 
abnormals. All of these infants were 
tested at least for their pain thresholds, 
but the results of prematures, infants 
delivered without difficulty by section, 
and otherwise normal infants whose 
mothers had accidents or serious illnesses 
during pregnancy are not included in the 
present report. They do not meet the 
criteria for classification as either normal 
or traumatized. Finally, if time per- 
mitted, normal Ss were selected from a 
survey of the hospital charts. It had been 
planned originally to obtain all Ss for 
the normal standardization group from 
those on whom oxygen tests had been 
made. However, as there were a number 
of interruptions in carrying out the oxy- 
gen-testing program, it became necessary 
to find additional Ss. The psychologist 
looked through the hospital charts and 
selected the first infant who met the 
criteria for normality. Negro and white 
patients and, to some extent, private and 
clinic patients, were located on different 
floors of the hospital, so that an initial 
choice as regards these characteristics 
had always to be made. White clinic pa- 
tients were most easily available to us 
and constitute the largest proportion of 
the control group. Negro and private 
patients were obtained only in sufficient 
numbers to test the relationship of these 
variables to test measures. 

Since this was an exploratory as well 
as a standardization study, during which 
we not only were in the process of de- 
veloping the techniques to be reported 
but also of eliminating others, not all the 
tests reached their final form at the same 
time. As soon as a method was in final 
form, we administered it to a sufficiently 
large number of normal Ss to provide 
standardization data on reliability and 
age changes, and it was subsequently ad- 
ministered only to traumatized Ss and 
those on whom oxygen tests had been 
made. Thus all tests were not adminis- 
tered to all Ss or to an equal number of 
Ss. Table 2 shows the distribution of 
socioeconomic status, sex, and race in the 
normal and traumatized Ss who were 
given each test, as well as in the total 
groups. Table 3 gives the age distribu- 
tion for initial tests and retests. 


PROCEDURE 


The Ss were examined in a hospital 
room maintained in the same manner 
as the regular nursery. No soundproof- 
ing was available in this room. When- 
ever extraneous noises were sufficiently 
loud to startle an S, test procedures were 
repeated. Examinations were carried out 
between 10:15 A.M. and 3:00 P.M. with 




TABLE 2 
SOCIOECONOMIC STATUS, RACE, AND SEX OF NORMAL AND TRAUMATIZED SUBJECTS 


Sex Socioeconomic Race 
Status 
Test Total Ss 
Male Female Private Clinic Negro White 


Pain: 
Normal — 39 5 33 63 84 
Traumatized 34 


Maturation Scale: 
Normal 
Traumatized 


Vision Scale: 
Normal 
Traumatized 


Irritability: 
Normal 
Traumatized 


Tension: 
Normal 
Traumatized 


Total Ss 
Normal 
Traumatized 


TABLE 3 


AGE DISTRIBUTION OF INITIAL TESTS (I) AND RETESTS (R) GIVEN TO NORMAL 
AND TRAUMATIZED SUBJECTS 


Day 3 


Day 5† 


I R R 


Normal 61 . 45 
Traumatized 46 


Maturation Scale: 
Normal 30 
Traumatized 13 


Vision Scale: 
Normal 6 20 
Traumatized 5 23 


Irritability: 
Normal 5 28 
Traumatized 5 20 13 


Tension: 
Normal 15 6 17 103 28 
Traumatized 17 2 2 4 4 29 12 


* Includes Ss from 12 to 36 hours old with the exception that 15 normal and 3 traumatized Ss were 
given pain tests when less than 12 hours old. 
† Includes traumatized Ss from 5 to 7 days old. 


20 
96 
55 
22 6 16 12 7 21 28 
23 14 20 17 10 27 37 
18 11 12 17 10 19 29 
17 12 11 18 11 18 29 
55 26 45 30 23 58 81 

Day 1* Day 2 Day 3 Day 4 Day 5† Total 
I R I R 



most Ss seen during the morning. As 
pointed out previously, the number and 
kind of tests given varied during the 
course of the study. For those Ss who 
were tested with the final battery, the 
pain threshold was obtained first. Vision 
tests were given whenever the infant 
opened his eyes, and the maturation and 
tension scale items were given in what- 
ever order best maintained the infant 
in a satisfactory state. Irritability was 
rated at the end of the examination. Detailed 
instructions for each of the procedures 
are given in the preceding paper (7). 


RESULTS⁴ 
Reliability 

In the present research, we are in- 
terested in three levels of measurement 
which require different methods of esti- 
mating reliability. In the first place, we 
are interested in differentiating groups. 
The reliability of the group measure- 
ments would be satisfactory if the error 
variance of the group means were significantly 
less than the differences between 
normal and traumatized groups. This 
will be dealt with in a later section. 

Secondly and primarily, we are con- 
cerned with the accuracy of identify- 
ing individuals as members of a group. 
Satisfactory reliability for this purpose 
would be achieved if errors in measure- 
ment were not large enough to change 
the individual’s classification as normal 
or abnormal. 

Thirdly, we are interested in the reliability 
of individual raw scores, partly because 
there are standard methods of presenting 
reliability data in this form and 
partly because, even where the interest is 
centered on classifying in two categories, 
there is always the possibility that raw 
scores or transformations of them can 
later be shown to measure degrees of abnormality 
or normality. We will, therefore, 
present reliability measures in terms 
of raw scores, although raw-score reliability 
will underestimate the reliability of 
classifying as normal or abnormal. It is 
only for this twofold classification that 
validation has been attempted. Three estimates 
of reliability have been made: 
(a) single-session reliability, given by 
split-half product-moment correlations, 
(b) test-retest agreement after 24 hours, 
and (c) interscorer agreement. 

⁴ The authors are grateful to Robert C. Bilger 
and to John C. Glidewell for advice on statistical 
treatment. 

1. Split-half reliability. Only the pain 
threshold score could be satisfactorily 
divided into two comparable halves. 
Split-half correlations (1st and 2nd 
halves) for normal and traumatized Ss 
are shown in Table 4. The correlations 
range from .82 to .97. Reliability is highest 
for the traumatized group, but this 
is undoubtedly due to the wider range 
of thresholds obtained in this group. 
When correlations are calculated sepa- 
rately for traumatized Ss with thresholds 
in the same range as the normal group, 
the difference in reliability disappears. 
Age heterogeneity apparently con- 
tributes little to raising the correlation. 
When age is partialled out, the correla- 
tion drops only from .87 to .82. It should 


TABLE 4 

SPLIT-HALF RELIABILITY OF THE PAIN 
THRESHOLD 

Subjects                               N     r (corrected) 

Normal Ss                                    .87 
Normal Ss (partial r without 
  age covariance)                            .82 
Traumatized Ss (total)                       .97 
  with normal scores                         .82 
  with above normal scores                   .93 




be pointed out that momentary fluctua- 
tions in the state of an infant are much 
more marked than in an adult and a 
split-half reliability estimate includes cor- 
respondingly more of the possible vari- 
able error. 
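
The split-half and partial correlations discussed above can be sketched as follows. This is only an illustration: the text does not name the correction behind the "r (corrected)" column, so the Spearman-Brown step-up is an assumption here, and the half-scores and age correlations in the usage lines are hypothetical, not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length.

    Assumption: this is the correction meant by "r (corrected)".
    """
    return 2 * r_half / (1 + r_half)


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical thresholds (volts) from the 1st and 2nd halves of one session.
first_half = [150, 170, 160, 200, 180]
second_half = [155, 165, 170, 190, 185]
r_half = pearson_r(first_half, second_half)
r_full = spearman_brown(r_half)

# Hypothetical correlations of each half-score with age, used to remove
# the part of the split-half correlation that age heterogeneity produces.
r_without_age = partial_r(r_half, 0.30, 0.30)
```

Partialling out age can only lower the correlation appreciably when both halves correlate substantially with age, which is the sense in which the text says age heterogeneity contributes little.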

2. Test-retest reliability. Test-retest 
correlations are reported only for nor- 
mal Ss. Since age is a variable rather 
than a source of error in the present 
study, changes associated with it should 


not be included in a measure of unreliability, 
unless the effect of age is relatively 
the same for all Ss. The effect of 
age is undoubtedly not the same for all 
infants in the traumatized group, as they 
are recovering at different rates from dif- 
ferent kinds and degrees of trauma and 
there is no way of separating the differ- 
ential changes due to age from those due 
to unreliability. Therefore test-retest 
correlations in this group could give lit- 
tle information on reliability. As far as 
normal Ss are concerned, we do not 
know whether they change differentially 
with age, although recovery from even 
a normal birth probably varies. How- 
ever, there can be no objection to re- 
garding test-retest correlation as at least 
a minimum estimate of reliability. 
Table 5 shows the number of Ss who 
were re-examined on each procedure and 
the agreement of test and retest score. 
Product-moment correlations were calculated 
for Pain Threshold and for 
Maturation and Vision Scale scores. Percentage 
of perfect agreement was used 
in the case of the rating scales. Retest 
reliability is lower than split-half, as 
would be expected, but it is satisfactory 
for the kind of tests used, especially 
in view of the age changes which are significant 
at this time of life. 

TABLE 5 

TEST-RETEST AGREEMENT AFTER 
TWENTY-FOUR HOURS 

Test                Normal Ss    Agreement 

Pain                24           r = .69 
Maturation Scale    20           r = .62 
Vision Scale        20           r = .62 
Irritability        28           75% perfect agreement 
Tension             28           86% perfect agreement 

3. Interscorer reliability. On all measures 
except the Pain Threshold, scores 
depend to some extent upon judgments 
made by the examiner. In order to determine 
whether such judgments constituted 
a major source of unreliability, 
two examiners simultaneously scored a 
sample of Ss, drawn at intervals throughout 
the course of the study, to insure 
that agreement did not vary over a 
period of time. The examiner who administered 
the tests had no knowledge 
of whether Ss were traumatized or not 
traumatized. Table 6 shows the number 
of Ss in each group observed by two examiners 
and the satisfactorily high interscorer 
agreement. As in the case of 
test-retest agreement, Tension ratings appear 
more reliable than those for Irritability. 

TABLE 6 

INTERSCORER AGREEMENT 

Test                Normal    Traumatized    Agreement 

Maturation Scale                             r = .97 
Vision Scale                                 r = .90 
Irritability                                 68% perfect agreement 
Tension                                      79% perfect agreement 

Objectivity 

Ideally, in such a study as this, all 
measurements should be made without 
knowledge of how the S is classified. It 
was not possible to do this routinely, but 
there are several lines of evidence which 
indicate that we were successful in avoiding 
bias. 


1. While there was often knowledge of the 
presence or absence of trauma, whether it was 
mild, moderate, or severe was not known. As 
will be subsequently shown, test results not only 
differentiated normal from traumatized Ss but 
also were correlated with the degree of trauma. 

2. Interscorer reliability was high (see Table 
6) for Ss scored by two examiners simultane- 
ously. The examiner who administered the tests 
on these check cases was without knowledge of 
the S's classification. 

3. A group of 16 traumatized and 31 normal 
Ss was examined without any knowledge of 
classification. The 16 traumatized Ss had been 
given a total of 41 tests and the 31 normal Ss 
a total of 68 tests. The scores of each S on the 
tests he was given were paired with the scores 
of an S tested with knowledge of classification. 
Pairing took into account age, race, and kind 
and degree of trauma (when present). The 
Sign Test was used to test the hypothesis that 
the score differences obtained under the two 
conditions were randomly distributed as to direc- 
tion. For this analysis, scores on all tests were 
considered simultaneously. Results showed that 
there were no significant differences in either 
the normal or traumatized Ss to suggest that 
scores were more “normal” or more “abnormal” 
when there was knowledge of the classification of 
the infant than when there was not such knowl- 
edge. Since bias might occur more easily on 
some of the more subjective tests than on others, 
F tests were applied to each test separately. Of 
the 10 comparisons made, normal and trauma- 
tized groups being treated separately, only one 
indicated a difference significant at the .05 
level. That was on the Tension Rating scale, 
where 14 normal Ss tested with knowledge were 
rated lower than those tested without knowledge. 
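
The Sign Test described in point 3 can be sketched as below. This is a minimal two-sided version that assumes ties are dropped, which is the common convention but is not stated in the text; the paired differences in the usage line are hypothetical.

```python
from math import comb


def sign_test(differences):
    """Two-sided sign test on paired score differences.

    Tests whether positive and negative differences are equally likely.
    Zero differences (ties) are dropped, an assumption of this sketch.
    """
    plus = sum(1 for d in differences if d > 0)
    minus = sum(1 for d in differences if d < 0)
    n = plus + minus
    if n == 0:
        return 1.0
    k = min(plus, minus)
    # Probability of k or fewer of the rarer sign under a fair coin.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical differences: score with knowledge of classification minus
# score of the paired S tested without such knowledge.
p_value = sign_test([2, -1, 0, 3, -2, 1, -1, 0, 1, -3])
```

A large probability, as with roughly balanced signs, is what the absence of examiner bias would look like under this test.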


Variables Affecting the Performance 


The normal and traumatized groups 
differ considerably in respect to age, sex, 
race, and private or clinic status, as may 
be seen from Tables 2 and 3. Before the 
performance of the two groups could be 
compared, therefore, it was necessary to 
determine whether or not these charac- 
teristics were related to the scores ob- 
tained by normal Ss. The existence of a 
relationship in traumatized Ss which did 
not exist in normal Ss would not con- 
found the results but would be assumed 


to be an interaction effect dependent 
upon trauma. 


In order to determine the effect of so many 
variables without allowing concomitant varia- 
tion, it was necessary to select smaller samples 
from the pool of Ss in such a way that factors 
other than the one being tested were held con- 
stant. Male and female Ss, paired for race, socio- 
economic status, and age, did not show differ- 
ences on any of the tests. Private and clinic Ss, 
paired for race and age but with sex allowed to 
vary since it did not influence performance, 
also did not differ from one another on any 
of the tests. The effects of race, age, retesting 
and condition of the Ss were similarly tested. 
The F ratio was used to compare scores on the 
Pain Threshold test and on the Maturation and 
Vision Scales. Chi square was employed in comparing 
scores on the two rating scales. The ratings 
were categorized as “0” or “not 0” in 
order to obtain sufficiently large theoretical fre- 
quencies but, even so, the expected frequencies 
were less than 5 in two of the eight compari- 
sons. However, none of the probabilities cal- 
culated from the chi-square distribution ap- 
proached significance and would not do so even 
if calculated directly. 

Private-clinic status and sex of the S did not 
measurably affect performance on any of our 
tests, and there was no practice or learning effect 
from retesting when age was held constant. 
However, age itself was significantly related to 
score on three tests. Older Ss were found to be 
more sensitive than younger to Pain, and to 
perform better on the Maturation and Vision 
Scales. Negro Ss were also superior to whites on 
the Maturation and Vision Scales but there was 
no difference on the other tests. Unless the con- 
dition of Ss being given the Maturation Scale 
was “satisfactory,” as previously defined, signifi- 
cantly poorer scores were obtained. This factor 
was not important, in so far as it was measured, 
on other tests. Where we have made assertions 
that a factor did affect performance, the usual 
5 or 1 per cent levels of significance of differ- 
ence are to be understood. Where the statement 
is made that a factor did not affect performance, 
the F ratios were in every case smaller than 1.0, 
and the highest chi-square probability was .18. 
There was, therefore, nothing to suggest that an 
effect would be demonstrated if another sample 
or a larger one were obtained. 
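
The chi-square comparisons on the rating scales, with ratings collapsed to “0” or “not 0”, can be illustrated with a small sketch. The cell counts are hypothetical, no continuity correction is applied (the text does not say whether one was used), and all row and column totals are assumed to be nonzero.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Chi square for a 2 x 2 table [[a, b], [c, d]] with 1 df.

    Returns the statistic, its probability, and the expected frequencies,
    so that cells with expected frequencies below 5 can be flagged.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    observed = [a, b, c, d]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 df, P(chi square > x) equals erfc(sqrt(x / 2)).
    p = erfc(sqrt(chi2 / 2))
    return chi2, p, expected


# Hypothetical counts: rows are male and female Ss, columns are ratings
# of "0" and "not 0" on one of the rating scales.
chi2, p, expected = chi_square_2x2(12, 4, 11, 5)
small_cells = [e for e in expected if e < 5]
```

When any expected frequency falls below 5, as the text notes for two of the eight comparisons, the chi-square probability is only approximate and a direct calculation of the probability is the safer check.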


Differences Between Normal and Trau- 
matized Groups 


Mean difference. In order to com- 
pare the performance of normal and 
traumatized groups on the test battery, 




it was necessary to equate the groups for 
those variables which have been shown 
to be related to test score. As neither 
the irritability nor tension ratings were 
affected by the variables tested, no pair- 
ing was necessary on these measures. The 
entire sample of normal and traumatized 
Ss could be used, therefore, with the 
exception of those rated on irritability 
a day or two after circumcision. Statisti- 
cal analysis, not reported, showed that 
this factor did raise the irritability rat- 
ing. Table 7 shows the size of the groups, 
the variables on which they were equated 
by pairing, the mean scores of the two 
groups, the statistic used in estimating 
probabilities, and the probability that 
differences between groups are due to 
chance. The means are included on all 
five measures, although they were of 
course not used when chi square was the 
comparison statistic. On both the pain 
thresholds and the Vision Scale there 
was a significant difference in the vari- 
ance of the traumatized and normal 
groups. In evaluating the mean difference, 
therefore, t was calculated according 
to the Cochran-Cox method with 
no assumptions about variance. On all 
tests the performance of the traumatized 
groups was significantly poorer than that 
of the normal groups. 
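
The Cochran-Cox approach to comparing means without assuming equal variances can be sketched as follows. The critical t values must be taken from a t table for each group's own degrees of freedom; the scores and the tabled values in the usage lines are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cochran_cox(x, y, t_crit_x, t_crit_y):
    """Cochran-Cox approximate t test for two independent groups.

    t_crit_x and t_crit_y are tabled critical t values at the chosen
    significance level for len(x) - 1 and len(y) - 1 degrees of freedom.
    Returns the observed t and the weighted critical value to compare it with.
    """
    wx = variance(x) / len(x)   # squared standard error of the first mean
    wy = variance(y) / len(y)   # squared standard error of the second mean
    t_observed = (mean(x) - mean(y)) / sqrt(wx + wy)
    t_critical = (wx * t_crit_x + wy * t_crit_y) / (wx + wy)
    return t_observed, t_critical


# Hypothetical pain thresholds in volts for a normal and a traumatized group.
normal = [150, 170, 160, 180, 165, 155]
traumatized = [200, 340, 180, 420, 260, 300]
# 4.032 is the two-tailed .01 critical t for 5 degrees of freedom.
t_obs, t_crit = cochran_cox(normal, traumatized, 4.032, 4.032)
significant = abs(t_obs) > t_crit
```

The weighting means that the group with the larger error variance, here the traumatized group, dominates the critical value.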

Shape of the distributions. In the pre- 
ceding section, evidence was presented 
that the normal and traumatized groups 
differed significantly, but nothing was 
said about the shape of the distributions. 
Figure 1 shows the distribution of pain 
thresholds in the two groups. Percent- 
age frequency rather than frequency is 
plotted on the ordinate to equate the 
size of the groups. Standard scores are 
plotted on the abscissa so that thres- 
holds of individuals tested on different 
days could be combined. Since the standard 
score transformation is a linear transformation, 
the shape of the curve is altered 
only to the extent that the distributions 
of the several days show differences 
in skewness. Standard scores of 
traumatized Ss are based on the normal 
group. 
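
The standard-score transformation used to combine thresholds from different days can be sketched as below. The means and standard deviations are hypothetical stand-ins for the normal-group values of each day; the essential point is that every S, normal or traumatized, is scaled against the normal group of the same age.

```python
def standard_scores(observations, normal_stats):
    """Convert (day, threshold) pairs to z scores based on the normal group.

    normal_stats maps day of life to the (mean, SD) of normal Ss on that day.
    """
    scores = []
    for day, threshold in observations:
        day_mean, day_sd = normal_stats[day]
        scores.append((threshold - day_mean) / day_sd)
    return scores


# Hypothetical normal-group means and SDs (volts) for days 1 and 2.
normal_stats = {1: (200.0, 50.0), 2: (180.0, 40.0)}

# Hypothetical traumatized Ss tested on different days.
z = standard_scores([(1, 300.0), (2, 180.0), (2, 300.0)], normal_stats)
```

Because each day's transformation is linear, pooling the z scores changes the shape of the combined curve only insofar as the separate daily distributions differ in shape.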


TABLE 7 
COMPARISON OF NORMAL AND TRAUMATIZED GROUPS ON FIVE TESTS 

Test                 Variables      Mean      Comparison 
                     Controlled     Scores    Statistic 

Pain Threshold       Age                      t test 
  Normal                            165 
  Traumatized                       270 

Maturation Scale     Age, race                F test 
  Normal 
  Traumatized 

Vision Scale         Age, race 
  Normal 
  Traumatized 

Irritability                                  Chi square 
  Normal 
  Traumatized 

Tension                                       Chi square 
  Normal 
  Traumatized 



10.6 
4.2 
29 
29 -48 


FIG. 1. Pain thresholds (in standard score form) of normal and traumatized subjects. 
[Figure: percentage frequency plotted against pain thresholds (z scores) for normal Ss and traumatized Ss, with the cutting point and its confidence limits marked.] 

The graph shows a skewed distribution 
in the normal group, the skewness 
probably reflecting the effect of a lower 
limit to the pain threshold. Thresholds 
of the traumatized group, on the other 
hand, cover a much wider range and 
do not appear to fall into any type 
of unimodal distribution. Unless it is assumed 
that the considerable variability 
has distorted what would otherwise be 
a unimodal distribution, we must assume 
a bi- or multimodal distribution. 
Since the tendency to bimodality is also 
present in each of the three subsamples 
of mild, moderate, and severely traumatized 
infants, the hypothesis of bimodality 
is strengthened. 

If thresholds of traumatized Ss are not 
unimodally distributed, what does this 
signify? It can only mean that pain sensitivity 
is not affected in a unitary fashion 
under conditions of trauma. More 
than one factor must be present. It is 
possible that the presence of a new factor 
is due to the greater intensity of 
stimulation used with some of the traumatized 
Ss, rather than to the trauma per 
se. This would be the case if other sense 
modalities were activated when a certain 
intensity of stimulation is reached. It is 
tempting, however, to speculate that the 
new factor is related to brain function- 
ing. Is there a threshold for impairment 
of brain functioning such that on one 
side of the threshold, the same factors determining 
pain sensitivity in normal 
brains are operating while, once beyond 
the threshold, sensitivity is determined 
by changed condition of the brain? It is 
idle to speculate on the brain physiology 
which might be involved; there are 
many physiological phenomena which 
show this all-or-none character. For pres- 
ent purposes, the point is of interest in 
selecting cutting scores and making pre- 
dictions. If the nonunitary character of 
pain sensitivity is due to changes in 
brain functioning, we would expect to 
find that Ss who later show evidence of 
brain damage would be selected from 
among those who form a second mode 
and not from among those who fall at 
the upper end of the normal distribu- 
tion. 

Graphic distributions of the other 
four tests are not presented. Interpreta- 
tion of them is complicated by the fact 
that scoring was empirically determined 




on the basis of observation by the au- 
thors, and changes in the scoring system 
would, of course, change the character of 
the distributions. The distribution of 
Maturation and Vision Scale scores was 
similar to that for Pain Threshold in 
both normal and traumatized groups. 
On the Irritability and Tension ratings, 
the normal groups show a heavy concentration 
of scores receiving a 0 rating 
with a rapid falling off of the curve. 
There is little tendency for frequencies 
to pile up at the tail as in a J curve. The 
traumatized groups, however, do show an 
increased frequency of higher ratings as 
well as a wider range. 


Cutting Points and Normative Data 


In order to identify those Ss among 
whom we expect to find later evidence 
of brain damage, it is desirable to es- 
tablish a cutting point. It would be pos- 
sible to correlate scores on each of the 
tests with later results. However, since 
we are dealing with a phenomenon of 
low frequency in the total population, it 
is likely that the percentage of “hits” 
can be increased by restricting ourselves 
to a more eligible subsample. If the bi- 
modal distribution of scores in the trau- 
matized group is related to brain func- 
tion, it becomes even more important to 
separate the population forming the 
second mode from that forming the first 
mode, since there would be no reason 
to expect differences in scores around the 
normal mode to be correlated with later 
damage. 


It is also desirable to have a cutting 
point which will include a minimum of 
false positives, i.e., normal Ss incorrectly 
called abnormal. From the application of 
Bayes’s theorem, which Meehl has made 
(12), it is easily demonstrated that when 


TABLE 8 

PAIN THRESHOLDS IN VOLTS OF NORMAL 
SUBJECTS FOR THE FIRST FIVE DAYS 
OF LIFE 

Day 
Statistic 1 2 3 4 5 


N 1§ 54 120 18 17 
Mean 205 185 155 120) 
SD 70 68 50 35.35 


p=.01 380 350 280 195 183 


a clinical group occurs with low fre- 
quency in the general population, there 
will be more incorrect than correct predictions 
even when the percentage of 
true positives approaches 100, unless 
false positives are kept at a minimum. 
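
The base-rate argument can be made concrete with a short sketch. The base rate, hit rate, and false-positive rates below are hypothetical figures chosen only to show the arithmetic; none of them comes from the study.

```python
def prediction_counts(base_rate, true_pos_rate, false_pos_rate,
                      population=10000):
    """Expected correct and incorrect 'abnormal' calls in a population.

    Returns (hits, false alarms, proportion of abnormal calls that are
    correct), following Bayes's theorem.
    """
    cases = population * base_rate
    noncases = population - cases
    hits = cases * true_pos_rate
    false_alarms = noncases * false_pos_rate
    called = hits + false_alarms
    proportion_correct = hits / called if called else 0.0
    return hits, false_alarms, proportion_correct


# Hypothetical: 1 per cent of infants affected and every one detected.
# With 5 per cent false positives, wrong calls far outnumber right ones.
loose_cut = prediction_counts(0.01, 1.0, 0.05)

# With false positives held to 1 per cent, right calls slightly exceed wrong.
strict_cut = prediction_counts(0.01, 1.0, 0.01)
```

Under these assumed figures the looser cut yields roughly five false alarms for every hit, which is the situation the text warns against.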
These considerations led us to select 
a cutting point at the extreme of the 
normal distribution—that point below 
which only 1 per cent of the normal 
population would fall. Cutting points 
for pain thresholds, Maturation, and 
Vision scores were set at t values with 
probabilities at the .02 level (or .01 level 
for a single tail), rather than at the 
observed p = .01 value in the sample. 
Basing cutting points on t values, when 
justified by an approximately normal 
distribution, appears preferable to using 
sample percentile points. Separate cut- 
ting points were determined for each 
day, and, in the case of Maturation and 
Vision scores, for both Negroes and 
whites. These cutting points as well as 
the means and standard deviation are 
shown in Tables 8, 9, and 10.⁵ The 
reliability of the cutting points for pain 
thresholds can be estimated from the 
95 per cent confidence limits, which are 
shown in Fig. 1. Bars rather than lines 


⁵ Because race and age did affect performance 
they required separate norms. An additional 
89 Ss were therefore tested in order to increase 
the size of N on the Maturation, Vision, Ir- 
ritability, and Tension scales. 




TABLE 9 


MATURATION SCALE SCORES OF NORMAL 
SUBJECTS FOR THE FIRST FIVE DAYS 
OF LIFE 


Statistic 


White Ss: 
N 


Mean 


identify the upper and lower confidence 
limits since this is a composite curve of 
the samples on each of the five days. The 
limits are sufficiently narrow so that 
false positives would vary only between 
o and g per cent if the cut is located at 
any point within the interval. The num- 
ber of true positives could vary more 
widely, but, even at the upper limits, 
none of the cases at the second mode 
is excluded. The confidence intervals 
for cutting points on the Vision Scale 
are also narrow. False positives would 
vary only from o to 2 per cent and true 
positives from 38 to 43 per cent. The 


TABLE 10 


Vision SCORES OF NORMAL SUBJECTS FOR THE 
First Five Days oF LIFE 


Statistic 


White Ss: 


Negro Ss: 
M ean 
SD 


p=.o1 


Maturation Scale shows less discrimina- 
tion than the other measures to begin 
with. While changes in the location of 
the cut would not increase the number 
of false positives, the discrimination of 
true positives could be cut from 25 to 7 
per cent. 


In determining cutting points for the 
Irritability and Tension Scales, it was 
necessary to use the observed p values 
since the distributions depart radically 
from the normal curve as can be seen 
from Table 11. However, ties in score 
(at a rating of 1.0) occurred in the lower 
6 per cent of Irritability Ratings and 3 
per cent of Tension Ratings so that a 
value unique to the first percentile could 
not be determined. While it is possible 
to divide the tied cases into appropriate 
proportions above and below the cutting 
point, this is not satisfactory when a de- 
cision must be made about classifying an 
individual $. The cut must be placed 
either just below or just above the score. 
The decision was made to place it above 
the rating of 1.0 on the Irritability Scale, 
thus excluding gg.5 per cent of normal 
Ss, and below the 1.0 rating on the 
Tension Scale, which excluded only 96.8 
per cent of normals. These decisions 
took into account the shape of both the 
normal and traumatized group distribu- 
tions and may to some extent have capi- 
talized on chance fluctuations. Estimates 


TABLE 11 


PERCENTAGE OF NORMAL SUBJECTS RECEIVING 
A GIVEN IRRITABILITY OR TENSION RATING 
ScorE (N= 186) 


Rating Score 


Irritability 
Tension 


Day 
XS 
2 3 4 5 
B | 37 21 20 21 20 
| 14.8 143.5 14.2 
SD ‘2 2.2 2.2 
p=.or 3.9 6.9 84 
Negro Ss: 
7 N 28 20 
Mean 12.4 13.7 
i SD 3.0 2.6 
p=.o1 4.9 7.0 
Day 
I 2 3 4 5 
N 37 26 23 20 24 
Mean 4.9 6.5 6.4 6.2 6.9 
SD 22 £24 2.5 5.9 
Scale 1.25 
: 23 21 above 
5.4 7.6 
2.0 2.0 76.3 $.9 16:2 5.6 
2.6 86 7.5 2.3 @.5 


28 GRAHAM, MATARAZZO, AND CALDWELL 


of the reliability of these cuts are neces- 
sarily crude. If g5 per cent confidence 
limits based on the binomial distribu- 
tion are determined, we can expect the 
percentage of false positives on the Ir- 
ritability Scale to vary from o to 4 per 
cent and the percentage of false positives 
on the Tension Scale to vary from 1 to 
7 per cent. 
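
The two kinds of calculation described in this passage can be sketched briefly: a cutting point set from a t value, and confidence limits for a percentage of false positives based on the binomial distribution. The function names are hypothetical and the numbers in the first example are invented. The t value must be looked up for the one-tailed .01 level and the sample's degrees of freedom. The exact (Clopper-Pearson) construction used for the limits is an assumption, since the text does not say which method or chart was used.

```python
from math import comb


def t_based_cut(mean, sd, t_critical, poor_scores_are_high):
    """Score at the poor extreme of a roughly normal distribution."""
    offset = t_critical * sd
    return mean + offset if poor_scores_are_high else mean - offset


def binomial_cdf(k, n, p):
    """P(X <= k) for X distributed as Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def binomial_limits(x, n, confidence=0.95):
    """Exact (Clopper-Pearson) limits for the proportion x/n, by bisection."""
    tail = (1.0 - confidence) / 2.0

    def solve(p_is_too_low):
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2.0
            if p_is_too_low(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2.0

    # Lower limit: the p at which observing x or more cases has probability `tail`.
    lower = 0.0 if x == 0 else solve(lambda p: 1.0 - binomial_cdf(x - 1, n, p) < tail)
    # Upper limit: the p at which observing x or fewer cases has probability `tail`.
    upper = 1.0 if x == n else solve(lambda p: binomial_cdf(x, n, p) > tail)
    return lower, upper


# Invented pain-threshold figures (volts); a high threshold is the poor direction.
example_cut = t_based_cut(mean=150.0, sd=50.0, t_critical=2.36, poor_scores_are_high=True)

# 6 of 186 is an inferred count: 3.2 per cent of the 186 normal Ss is about six infants.
tension_lower, tension_upper = binomial_limits(6, 186)
print(example_cut, round(100 * tension_lower, 1), round(100 * tension_upper, 1))
```

For 6 of 186 the sketch should give limits near 1 and 7 per cent, in line with the range quoted above for the Tension Scale.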

These cutting points are only tenta- 
tively identified. The optimal cut, in the 
sense of giving minimal overlap in both 
directions, lies at the intersection of the 
normal and traumatized group distribu- 
tions. For the reasons outlined above, 
we did not feel that such a cut would be 
useful in a situation where base rates 
are presumably low and where it is 
more important to exclude false
positives than to reduce false negatives.
Final decisions about the value of these
cutting points depend upon their
relationship to signs of brain damage in the Ss’
subsequent development. For this rea- 
son, and because the points are deter- 
mined entirely by the normal samples, 
which are reasonably large and reliably 
measured, we did not reserve a portion 
of the data for cross validation. 

A rough measure of the relative dis- 
criminating power of the five tests may 
be obtained by comparing the percentage
of traumatized Ss called abnormal.
Table 12 shows the percentage of Ss who 
score on the abnormal side of the cut- 
ting point on any one or more tests and 
on each test separately. When Ss were re- 
tested, the poorest performance has been 
taken as the score on a test. These data 
are supplied for the normal and trau- 
matized groups and for the three sub- 
samples of traumatized Ss. Pain thresh- 
olds and the Vision Scale are superior, 
but all tests identify some Ss as ab- 
normal. The percentage identified as ab- 
normal appears to increase with the de- 
gree of trauma and, if scores on all tests 
are considered, is statistically significant 
at the .o1 level when tested by chi 
square. 
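
A chi-square test of this kind can be sketched as follows. The counts are hypothetical, since only percentages are reported here, and the function name is illustrative. With three trauma groups and two outcomes the table has 2 degrees of freedom, for which the upper-tail probability has the simple closed form exp(-x/2).

```python
from math import exp


def chi_square(table):
    """Pearson chi square for a contingency table given as rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical counts; columns are mild, moderate, and severe trauma.
counts = [
    [13, 12, 16],  # Ss beyond the cutting point on at least one test
    [22, 14, 3],   # Ss within normal limits on every test
]
statistic, df = chi_square(counts)

# Exact upper-tail probability, valid only because df == 2.
p_value = exp(-statistic / 2.0)
print(round(statistic, 2), df, round(p_value, 4))
```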


Intercorrelations 


How are the various tests related to
one another? Since they were designed
initially to detect differences between
traumatized and normal Ss rather than
among normals, the intercorrelations
might be expected to differ for the two
kinds of Ss. They are therefore
presented separately, as well as for the
total group, in Table 13. As pointed out
earlier, not all tests were given to all
Ss. The traumatized cases included in
the correlations are all those Ss who had
scores on at least two tests. The 46
normal cases are the total number of Ss
to whom all five tests were given.
Standard scores were used and the
direction of scores changed so that all
correlations may be similarly interpreted.

TABLE 12
PERCENTAGE OF SUBJECTS IDENTIFIED AS ABNORMAL BY SCORES BELOW THE CUTTING POINT
ON THE DAY OF POOREST PERFORMANCE

                                    Total       Mild     Moderate   Severe
Test                     Normal  Traumatized   Trauma     Trauma    Trauma
Pain                        1        42          30         43        57
Maturation Scale            0        25          12        (33)*     (50)
Vision Scale                1        41          31        (17)       60
Irritability                1        28           8        (33)       46
Tension                     3        34          23        (33)       46
Any one or more tests       4        51          37         46        84

* Percentages in parentheses are based on an N of less than ten.

TABLE 13
INTERCORRELATIONS AMONG TEST PROCEDURES

                Normal Ss                  Traumatized Ss       Total group
Test            N    r                     N    r               N    r
Pain-Mat.       46   .14 [sign illegible]  26   -.04            72   .06
Pain-Vision     46   -.11                  40   -.03            86   .06
Pain-Irrit.     46   [illegible]           42   .08             88   .07
Pain-Tens.      46   .03                   41   .16             87   [illegible]
Mat.-Vision     46   [illegible]           22   [illegible]     68   [illegible]
Mat.-Irrit.     46   .09                   27   -.02            73   [illegible]
Mat.-Tens.      46   .01                   26   .20             72   .20
Vision-Irrit.   46   -.24                  30   [illegible]     76   [illegible]
Vision-Tens.    46   -.17                  30   [illegible]     76   [illegible]
Irrit.-Tens.    46   [illegible]           42   [illegible]     88   [illegible]

Table 13 shows that the
intercorrelations are, in fact, different for
the two groups. The Maturation and Vision
Scales, both presumably measuring
developmental level and having
considerable spread in both groups, are
positively correlated with one another,
although the correlation is not
statistically significant in the traumatized
group. As would be expected,
Irritability and Tension are positively
correlated in the traumatized group but not
in the normal. They are measures
designed to detect deviations in the
direction of abnormality and therefore have
little variability in the normal group.
Irritability, a measure of abnormality,
and Vision, a measure of developmental
level, are significantly related only in
the traumatized subjects.

The finding of a correlation among
normal Ss between Pain, which does
have spread within the normal range,
and Irritability, which does not, is
somewhat surprising. One can only suggest
that the irritable infant is responding
to many stimuli in a diffuse fashion and
therefore tends to be less sensitive or less
set for responding to a specific stimulus
in a specific way. This same analysis
might be expected to hold true among
traumatized Ss except for the fact that
some traumatized infants become more
irritable while others become obtunded.
The obtunded infant is scored as normal
in Irritability but will, of course, be
relatively insensitive to pain as well as
to other stimuli.

Perhaps the most significant finding is
that all intercorrelations are low. This
is understandable since trauma to the
newborn may be manifested in a variety
of ways, some of which are incompatible
with one another, as, for example,
obtundity and hyperirritability. Such
relatively low intercorrelations among the
different tests, together with the
adequate discriminating ability of each
considered separately, point to the
advisability of using a combined score on
all procedures as an impairment index.
This would achieve one of the
characteristics desirable in constructing a
test battery, i.e., low intercorrelations
among the tests and high correlations
with the criterion.
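
The intercorrelations discussed above were computed on standard scores after the direction of some scores had been changed. A minimal sketch of that procedure follows. The function names and the data are hypothetical, and the use of the population standard deviation is an assumption. A pain threshold is treated as a score on which a high value is poor, and a Vision score as one on which a high value is good.

```python
from statistics import mean, pstdev


def standard_scores(scores, higher_is_better=True):
    """Convert raw scores to z scores, reversing sign when high scores are poor."""
    m, s = mean(scores), pstdev(scores)
    sign = 1.0 if higher_is_better else -1.0
    return [sign * (x - m) / s for x in scores]


def correlation(z_x, z_y):
    """Pearson r computed from two equally long lists of standard scores."""
    return sum(a * b for a, b in zip(z_x, z_y)) / len(z_x)


# Invented scores for five infants.
pain_volts = [120, 150, 180, 210, 260]   # high threshold = poor performance
vision = [8, 7, 7, 5, 3]                 # high score = good performance

z_pain = standard_scores(pain_volts, higher_is_better=False)
z_vision = standard_scores(vision)
r = correlation(z_pain, z_vision)
print(round(r, 2))
```

After the sign change a positive r means that good performance on one test goes with good performance on the other, whatever the raw scoring direction of either test.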


Another possible way of improving 
predictive accuracy would be to consider 
the length of time that an S’s performance
remains abnormal. In retesting
traumatized Ss, we observed that the
length of time scores remained abnormal
varied considerably from infant to
infant. The duration of an abnormal
performance thus provides another
dimension along which to measure the
infant’s response to trauma and offers
promise as an additional way of identify- 
ing those Ss on whom trauma will leave 
a permanent imprint. 
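
The two components suggested here, the number of tests showing impairment and the length of time impairment persists, could be tallied as in the sketch below. The record layout, names, and data are hypothetical. The text does not say how the two components should be weighted or combined, so the sketch only reports them side by side, counting duration as the number of test days with at least one abnormal score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    test: str        # e.g. "Pain", "Vision"
    day: int         # day of life on which the test was given
    abnormal: bool   # True if the score fell beyond the cutting point


def impairment_summary(observations):
    """Return (number of tests ever abnormal, number of days with any abnormal score)."""
    tests_impaired = {o.test for o in observations if o.abnormal}
    days_impaired = {o.day for o in observations if o.abnormal}
    return len(tests_impaired), len(days_impaired)


# Invented record for one traumatized infant tested on days 1 through 4.
record = [
    Observation("Pain", 1, True),
    Observation("Vision", 1, True),
    Observation("Pain", 2, True),
    Observation("Vision", 2, False),
    Observation("Pain", 3, True),
    Observation("Pain", 4, False),
    Observation("Tension", 4, False),
]
tests_count, days_count = impairment_summary(record)
print(tests_count, days_count)
```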


Discussion 


The group of tests we have used sam- 
ples much of the repertoire of an infant’s 
response to his environment. All of the 
responses are relatively simple, but they 
represent a substantial portion of the 
most complicated behavior which an in- 
fant of this age can show. How com- 
plicated is such behavior? With the ex- 
ception of the two rating scales, the 
tests can be described as measuring sen- 
sorimotor ability, i.e., (a) the capacity to 
respond at all to various kinds of sensory 
stimuli, and (b) the extent to which the 
response is specific to a particular stimu- 
lus. The ratings of irritability and of 
muscular tension provide two more di- 
mensions along which all responses of an 
infant, both spontaneous and elicited, 
may be described. We should like to 
know whether measuring such behavior 
gives any information about either past 
or future development. 

Does sensorimotor functioning reflect the de- 
velopmental level of the nervous system? Gesell 
(6) has carried out extensive studies of prema- 
ture and full term newborns which suggest that 
simple sensorimotor abilities are in the process 
of development during the last months of fetal 
life and therefore should provide measures of
the stage of development of the nervous system.
This pioneer work needs to be confirmed, however,
by studies using standardized procedures
whose reliability can be established. The diffi- 
culty we experienced in adapting from Gesell a 
reliable maturational scale and the widely di- 
vergent placement of similar items on various 
infant scales indicate that slight variations in 
method can produce large differences in results. 

Does sensorimotor functioning at birth predict 
later development? Efforts to answer this ques- 
tion have extended over several decades, but it is 
still not possible to formulate definite conclu- 
sions. This is partly because similar data have 
been interpreted by different standards. A cor- 
relation of a given size does not arouse the same 
response in all psychologists. The data, however, 
are also conflicting. Some studies have shown 
small negative correlations between performance 
in early infancy and several years later (10, p. 
637f). On the other hand, sizable positive cor- 
relations have been reported between tests at 
six months and tests as late as three years (14). 
Such factors as inadequately standardized tests, 
difficulties in controlling the infant’s momen- 
tary state (3), differences in the kind of infant 
tests, and differences in the kind of development 
which is being predicted all contribute to the 
confusion. 

Irwin (9) has questioned the reasonableness
of expecting to predict “intelligent” behavior 
from neonatal response under even the most 
favorable conditions and with sensitive tests. 
Neurophysiological work suggests that the new- 
born is not capable of cortical functioning (4, 
p. 60f). While there is opposition to this view,
the argument is mainly whether there is no 
cortical activity or some minimal amount (11). 
Irwin’s (9) position is essentially that you
cannot expect to predict the future functioning
of the cerebral cortex on the basis of tests made 
at a time when the cortex is nonfunctional. On 
logical grounds alone, this is not a necessary 
conclusion. There is no a priori reason why 
the functioning of subcortical structures should 
be uncorrelated with later cortical development. 
From Hebb’s (8, p. 109ff) interesting hypothesis 
that the higher organism is the slower learner 
initially, one might even infer that there is a 
negative correlation. Practically, however, the 
greater the gap between phenomena observed 
and those to be predicted, the more difficult it 
is likely to be to establish relationships. 

The present work is not primarily concerned 
with predicting the relative superiority of
“normal” individuals, but rather with determining
whether external trauma has caused brain in- 
jury. We did find that a considerable percent- 
age of traumatized infants show impaired func- 
tioning as compared with nontraumatized new- 
borns, and that such impairment is related to 
clinical judgments of severity of trauma. But
will measures of impairment of a newborn predict
the extent of later impairment? The question
cannot be answered at the present time.
One can say only that it seems reasonable to 
assume that the greater the present trauma, 
the greater the likelihood that some cells will 
suffer irreversible damage. Since the cerebral 
cortex of the infant is relatively nonfunctional, 
is it likely that trauma to the cortex could be 
detected by newborn tests? The answer would 
seem to be no, if the cortex is, in fact, com- 
pletely nonfunctional and if the trauma were 
limited only to the cortex. Destruction of a 
part of the nervous system before it is mature 
enough to function will not give rise to the 
expected symptoms until “the time arrives for 
that structure to play its proper role” (4, p. 69). 
However, many kinds of trauma, and especially 
anoxia, may produce diffuse, multiple lesions in 
both cortical and subcortical structures (1). With 
such trauma, the extent of disruption of sub- 
cortical functions might provide an index of the 
degree of total damage. 

We suggested that a general impairment index 
could be computed by combining the number 
of tests showing impairment and the length of 
time impairment persists. The extent to which 
such an index will be valuable depends on 
whether or not specific areas of damage are as- 
sociated with impairment on specific tests. With- 
out discarding the possibility of specificity en- 
tirely, we would relegate it to a minor role. 
Functioning of a newborn is relatively undif- 
ferentiated and it seems likely that the extent 
of trauma would be the most important factor 
in determining subsequent impairment. This 
could be true either if sizable injuries in any 
area affect most functions, or if there is a tend- 
ency for the kinds of trauma to which a new- 
born is exposed to cause injury in the same 
areas because these are more susceptible to in- 
sult. 

It would be interesting to know more about 
the nature of the impairment that occurs. Be- 
cause the intensity of the pain stimulus was 
systematically varied, we were able to observe 
that in infants whose pain thresholds were 
abnormally high, the ability to perceive the 
stimulus did not seem impaired, since many Ss 
cried or gave other general responses to weak 
stimuli. Similarly, motor pathways were appar- 
ently intact, since the required leg movements 
could occur independently of stimulation. The 
difficulty, then, must lie in a failure to integrate 
stimulus and response in such a way that a 
response appropriate or specific to the stimulus 
could occur. In the development of the nervous 
system, sensory and motor nerves are in con- 
tact with their respective organs before it is 
possible for an excitation to pass from sensory
to motor mechanism. We may speculate that,
in the loss of the specific connection between 
stimulus and response under conditions of 
trauma, we see an example of the phenomenon 
observed in adult brain-damaged individuals, 
namely, that those abilities most recently ac- 
quired are most readily lost. 

One other phenomenon deserves mention. 
Some of the traumatized infants might be de- 
scribed as hyperreactive with increased muscular 
tension and irritability. These infants were 
sensitive to any kind of mild stimulation but 
gave generalized rather than specific responses. 
Other traumatized infants showed diminished 
general activity, were flaccid and apathetic. De- 
viations in either direction could impair per- 
formance on various tests. The direction of de- 
viation was not taken directly into account in 
the scoring, although apathetic infants were 
more likely to be penalized on the Maturation 
Scale. Are these two kinds of impairment re- 
lated to the severity of the trauma? Is it pos- 
sible that they are analogous to stages in the 
development of coma in adults? Initially, “in 
anesthesia and in coma induced by changes in 
the internal environment (decrease in the oxygen 
tension or in the blood sugar level) the cortex 
and brain stem suffer opposite changes in ex- 
citability” (5, p. 212). While cortical activity 
is diminished, there is increased reactivity of the 
brain stem. As coma deepens, brain-stem activ- 
ity likewise decreases (5, p. 217). 

We have been interested in differentiating 
those infants whose trauma is contemporaneous 
with the examination. What can be said of 
the possibility of detecting injury which has 
occurred some time prior to birth? This will 
depend on the extent to which our measures 
reflect temporary effects due to a present dis- 
turbance of the nervous system. The newborn 
traumatized early in intrauterine life should 
not appear irritable or obtunded, and if the 
damage suffered has been restricted to the cortex 
there might be no behavioral manifestations at 
birth. On the other hand, if subcortical cen- 
ters have been damaged, we might expect im- 
pairment of sensorimotor functioning. We had 
the opportunity of testing three mongols—one 
of the few kinds of fetal injury which, because 
of the physical peculiarities, can be detected 
at birth. All three did show impairment on 
Vision and Maturation Scales and also showed 
the marked muscular flaccidity which character- 
izes this condition. Their pain thresholds, how- 
ever, were normal and, except for one infant 
with an intestinal infection, they were not ir- 
ritable. 

In concluding, an encouraging word ought to 
be said for the value of infants as psychological 
Ss. In the last decade, they have been largely
overlooked, yet their very youngness offers
unique advantages in psychological areas where 
the complexity of social and other environ- 
mental influences makes it difficult to disentangle 
the relevant variables. There is ample room 
for improvement in techniques of measurement. 
Our relatively crude efforts at objectification 
have been rewarding and have suggested many 
lines which might profitably be followed with 
more precise methods. 

We aimed to develop a short battery of tests 
requiring minimal equipment in order that a 
large number of Ss could be studied quickly. Be- 
cause of these requirements, some promising 
methods were dropped which ought to be in- 
vestigated more fully. With photographic equip- 
ment, for example, thresholds could be obtained 
for the pupillary response to light. Electronic 
equipment is now available that would permit 
measurement of several aspects of infant activity 
level. Early work by Richter (15) and Wenger 
(17) related the height of the skin resistance 
level to states of tension and wakefulness.
Individual differences in hydration and in
toughness of the skin present serious problems, but
the desirability of obtaining an independent and 
objective measure of the state of an infant is 
great. 

The most conspicuous omission from our bat- 
tery is a measure of learning. A series of early 
studies attempting to establish classical condi- 
tioning in the newborn were disappointing (13, 
p. 376f) but these studies did not exhaust the 
possibilities of studying the conditions for and 
kinds of changes in behavior which newborns 
can exhibit. We found that, except for the rat- 
ing scales, performance on our tests improved 
with age during the first five days of life. In 
the case of pain thresholds, the improvement 
was linearly related to age change, as can be 
seen in Figure 2. Is this improvement due to
maturation, to experience, to recovery from the
trauma of birth, or to a combination of these
factors?

Fig. 2. Mean pain thresholds of normal subjects
on the first five days of life. (Vertical axis: pain
threshold in volts. Horizontal axis: age in days.)
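
The linear relation between pain threshold and age can be checked by fitting a least-squares line to the daily means. The sketch below shows the calculation. The function name is illustrative and the five threshold values are invented, not read from Table 8 or Figure 2.

```python
def least_squares_line(x, y):
    """Return slope, intercept, and Pearson r for the line y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


days = [1, 2, 3, 4, 5]
mean_threshold_volts = [200, 185, 160, 145, 120]   # invented daily means

slope, intercept, r = least_squares_line(days, mean_threshold_volts)
print(slope, intercept, round(r, 3))
```

With these invented means the fitted line drops 20 volts per day, and an r close to -1 is what a nearly linear decline would look like.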

Without raising the question of whether the 
improvement is due to maturation or learning, 
one may ask what it is that has improved. In the 
case of the pain thresholds, for example, has 
the infant become more sensitive to pain as he 
grows older, or does the age change represent 
increased ability to give a differentiated re- 
sponse? Our results cannot separate the two 
possibilities. It would be interesting to de- 
termine whether the age changes are still pres- 
ent when any kind of response to the stimulus 
is recorded, i.e., when a generalized rather 
than a differentiated response is measured. If 
there are no age changes in the ability to make 
a generalized response, then the hypothesis of 
increased sensitivity could be rejected. This pro- 
cedure was actually used in two previous studies, 
one by Dockeray (2) in 1934, and one by Sher- 
man, Sherman, and Flory (16) in 1936. Unfor- 
tunately, their results disagree, which is perhaps 
to be expected since only very crude control of 
stimulus intensity was possible at that time. 
Repetition of this experiment with modern 
methods of stimulus control should be able to 
provide a more conclusive answer. 


SUMMARY 


Five test procedures, described in the 
preceding paper (7), were administered 
to 265 infants without prenatal, perina- 
tal, or postnatal complications and to 81 
infants suffering from anoxia, mechani- 
cal birth injury, or diseases or infections 
associated with brain damage. The trau- 
matized newborns composed nearly the 
total population of such infants born at 
the St. Louis Maternity Hospital during 
a two-year period. 

The five tests consist of a Pain Thresh- 
old Test, a Maturation Scale, a Vision 
Scale, an Irritability Rating and a Mus- 
cle Tension Rating. Reliability of the 
procedures was variously measured by 
split-half correlation, test-retest agree- 
ment, and interscorer agreement as ap- 
plicable. All of the tests appeared to be 
satisfactorily reliable. A sample of 109 
test scores obtained without knowledge 
of Ss’ classification did not differ signi- 
ficantly from those obtained under the 
usual conditions of partial knowledge. 


Norms have been presented for each 
test, with separate norms provided for 
each of the first five days of life and for 
Negro and white Ss where these variables 
were related to performance. Older Ss 
were found to be more sensitive than 
younger on the Pain Threshold Test and 
to perform better on the Maturation and 
Vision Scales. Negro Ss were superior 
to whites on both the Maturation and 
Vision Scales, but there was no race dif- 
ference on the other tests. Private-clinic 
status and sex of the S did not measur- 
ably affect performance. There was no 
practice or learning effect from retesting 
when age was held constant. 

Normal and traumatized groups,
paired for relevant variables, obtained
significantly different scores on all tests.
When a cutting point at the poorer extreme
of the normal distribution was
selected, all tests identified some trau- 
matized Ss as abnormal while false posi- 
tives ranged only from 1 to 3 per cent. 
The percentage identified as abnormal 
increased with the seriousness, as rated 
by pediatric judges, of the trauma. Since 
intercorrelations of the five tests were 
low, a combined abnormality score was 
tentatively recommended. 

The question of whether newborn be- 
havior can predict either past or future 
development of the infant was discussed. 
Cautious consideration was also given 
to the relationship between the present 
findings and neuro-physiological knowl- 
edge about functioning of the newborn 
brain. 


REFERENCES 


1. COURVILLE, C. B. Cerebral anoxia. Los
Angeles: San Lucas, 1953.

2. DOCKERAY, F. C., & RICE, CHARLOTTE.
Response of newborn infants to pain
stimulation. Ohio State Univer. Studies. Contrib.
Psychol., 1934, No. 12, 82-93.

3. ESCALONA, SIBYLLE. The use of infant tests
for predictive purposes. Bull. Menninger
Clin., 1950, 14, 117-128.

4. FORD, F. R. Diseases of the nervous system
in infancy, childhood and adolescence.
(3rd Ed.) Springfield, Ill.: Charles C
Thomas, 1952.

5. GELLHORN, E. Physiological foundations of
neurology and psychiatry. Minneapolis:
Univer. of Minnesota Press, 1953.

6. GESELL, A., & AMATRUDA, CATHERINE.
Developmental diagnosis. New York: Hoeber,
1941.

7. GRAHAM, FRANCES K. Behavioral differences
between normal and traumatized newborns.
I. The test procedures. Psychol. Monogr.,
1956, 70, No. 20 (Whole No. 427).

8. HEBB, D. O. The organization of behavior.
New York: Wiley, 1949.

9. IRWIN, O. C. Can infants have IQ’s? Psychol.
Rev., 1942, 49, 69-79.

10. JONES, H. E. The environment and mental
development. In L. Carmichael (Ed.)
Manual of child psychology. (2nd Ed.) New
York: Wiley, 1954.

11. KLEITMAN, N. The role of the cerebral
cortex in the development and maintenance
of consciousness. In H. A. Abramson (Ed.)
Problems of consciousness. New York:
Josiah Macy Jr. Foundation, 1955.

12. MEEHL, P. E., & ROSEN, A. Antecedent
probability and the efficiency of psychometric
signs, patterns, or cutting scores. Psychol.
Bull., 1955, 52, 194-216.

13. MUNN, N. L. Learning in children. In L.
Carmichael (Ed.) Manual of child
psychology. (2nd Ed.) New York: Wiley, 1954.

14. NELSON, VIRGINIA L., & RICHARDS, T. W.
Studies in mental development. I.
Performance on Gesell items at six months
and its predictive value for performance
on mental tests at two and three years.
J. genet. Psychol., 1938, 52, 303-325.

15. RICHTER, C. High electrical resistance of the
skin of newborn infants and its
significance. Amer. J. Dis. Children, 1930, 40,
18-26.

16. SHERMAN, M., SHERMAN, IRENE, & FLORY,
C. D. Infant behavior. Comp. Psychol.
Monogr., 1936, 12, No. 4.

17. WENGER, M. A., & IRWIN, O. C. Fluctuations
in skin resistance of infants and adults
and their relation to muscular processes.
Univer. Iowa Stud. Child Welf., 1936, 12,
141-179.


(Accepted for publication December 19, 1955) 




GEORGE BANTA COMPANY, INC., MENASHA, WISCONSIN 


